Claims

What is claimed is: 

1. A method for tracking a speaker in an audio source, said method comprising
the steps of:

identifying potential segment boundaries in said audio source; and
clustering homogeneous segments from said audio source substantially
concurrently with said identifying step.

2. The method of claim 1, wherein said identifying step identifies segment
boundaries using a BIC model-selection criterion.

3. The method of claim 2, wherein a first model assumes there is no boundary
in a portion of said audio source and a second model assumes there is a boundary in said
portion of said audio source.

4. The method of claim 2, wherein a given sample, i, in said audio source is
likely to be a segment boundary if the following expression is negative:

where |Σ_w| is the determinant of the covariance of the window of all n samples, |Σ_f| is the
determinant of the covariance of the first subdivision of the window, and |Σ_s| is the
determinant of the covariance of the second subdivision of the window.
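
The expression named in claim 4 does not appear in this text. As an illustration only, the sketch below evaluates the ΔBIC form commonly used in the segmentation literature for full-covariance Gaussian models. This form is an assumption and may differ from the claimed expression. The function names, the penalty weight `lam`, and the feature dimension `d` are hypothetical.

```python
import math

def covariance(samples):
    """Maximum-likelihood covariance matrix of a list of d-dimensional vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(x[j] for x in samples) / n for j in range(d)]
    return [[sum((x[a] - mean[a]) * (x[b] - mean[b]) for x in samples) / n
             for b in range(d)] for a in range(d)]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    d, det = len(m), 1.0
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(m[r][c]))
        if m[p][c] == 0.0:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, d):
            f = m[r][c] / m[c][c]
            for k in range(c, d):
                m[r][k] -= f * m[c][k]
    return det

def delta_bic(window, i, lam=1.0):
    """Assumed form: -(n/2)log|S_w| + (i/2)log|S_f| + ((n-i)/2)log|S_s| + penalty.

    A negative value favours the two-model hypothesis (boundary at sample i).
    Assumes every covariance is non-singular.
    """
    n, d = len(window), len(window[0])
    log_det = lambda s: math.log(determinant(covariance(s)))
    penalty = 0.5 * lam * (d + d * (d + 1) / 2) * math.log(n)
    return (-(n / 2) * log_det(window)
            + (i / 2) * log_det(window[:i])
            + ((n - i) / 2) * log_det(window[i:])
            + penalty)
```

Under this form, a window whose two halves share one distribution yields a positive value because the penalty term dominates, while a window spanning a change yields a strongly negative value.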

5. The method of claim 1, wherein said identifying step considers a smaller
window size, n, of samples in areas where a segment boundary is unlikely to occur. 

6. The method of claim 5, wherein said window size, n, is increased in a 
relatively slow manner when the window size is small and increases in a faster manner when 
the window size is larger. 

7. The method of claim 5, wherein said window size, n, is initialized to a 
minimum value after a segment boundary is detected. 
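
The window schedule of claims 6 and 7 can be sketched as follows. The growth rule (an increment proportional to the current size) and the values of `n_min` and `step` are assumptions chosen for illustration; the claims only require slow growth for small windows, faster growth for larger ones, and a reset to a minimum after a detected boundary.

```python
def next_window_size(n, boundary_found, n_min=100, step=10):
    """Return the next window size, in samples.

    n_min and step are hypothetical values, not taken from the claims.
    """
    if boundary_found:
        # Claim 7: restart from the minimum size after a boundary is detected.
        return n_min
    # Claim 6: the increment grows with the window, so small windows grow
    # slowly and large windows grow faster.
    return n + step * max(1, n // n_min)
```

With these defaults a 100-sample window grows by 10 samples, while a 1000-sample window grows by 100.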

8. The method of claim 2, wherein said BIC model selection test is not 
performed at the border of each window of samples. 

9. The method of claim 2, wherein said BIC model selection test is not 
performed when the window size, n, exceeds a predefined threshold. 
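
One way to read claims 8 and 9 is as a filter on which sample indices receive a BIC test at all. The sketch below is a minimal illustration under that reading; the function name, the `border` width, and the `n_max` threshold are hypothetical.

```python
def candidate_boundaries(n, border=50, n_max=1000):
    """Indices within a window of n samples at which a BIC test would be run.

    border and n_max are assumed values used only for illustration.
    """
    if n > n_max:
        # Claim 9: no tests once the window exceeds the predefined threshold.
        return []
    # Claim 8: skip the samples at either border of the window.
    return list(range(border, n - border))
```

Skipping these tests reduces the number of covariance determinants that have to be computed per window.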

10. The method of claim 1, wherein said clustering step is performed using a BIC 
model-selection criterion. 

11. The method of claim 10, wherein a first model assumes that two segments
or clusters should be merged, and a second model assumes that said two segments or clusters 
should be maintained independently. 

12. The method of claim 11, further comprising the step of merging said two 
clusters if a difference in BIC values for each of said models is positive. 
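
The merge decision of claims 11 and 12 can be illustrated with a simplified one-dimensional sketch, in which the covariance determinant reduces to a variance. The sign convention (merged-model BIC minus separate-model BIC), the penalty weight `lam`, and the function names are assumptions.

```python
import math
from statistics import pvariance

def merge_delta_bic(a, b, lam=1.0):
    """BIC of the merged model minus BIC of the two-model alternative.

    Simplification: 1-D features, so each Gaussian has two parameters
    (mean and variance) and the penalty reduces to lam * log(n).
    Assumes every input has non-zero variance.
    """
    merged = a + b
    n = len(merged)
    log_var = lambda xs: math.log(pvariance(xs))
    return (-(n / 2) * log_var(merged)
            + (len(a) / 2) * log_var(a)
            + (len(b) / 2) * log_var(b)
            + lam * math.log(n))

def should_merge(a, b, lam=1.0):
    # Claim 12: merge when the difference in BIC values is positive.
    return merge_delta_bic(a, b, lam) > 0
```

Two segments drawn from the same distribution give a positive difference and are merged; segments with clearly different means give a negative difference and stay separate.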

13. The method of claim 1, wherein said clustering step is performed using K 
previously identified clusters and M segments to be clustered.

14. The method of claim 1, further comprising the step of assigning a cluster 
identifier to each of said clusters. 
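
Claims 13 and 14 describe clustering M new segments against K existing clusters and giving each cluster an identifier. The sketch below is one possible greedy reading, again using one-dimensional features; the assignment strategy, the integer identifiers, and all names are assumptions rather than the claimed procedure.

```python
import math
from statistics import pvariance

def merge_delta_bic(a, b, lam=1.0):
    """Positive when a single Gaussian explains a and b better (1-D simplification)."""
    merged = a + b
    n = len(merged)
    log_var = lambda xs: math.log(pvariance(xs))
    return (-(n / 2) * log_var(merged) + (len(a) / 2) * log_var(a)
            + (len(b) / 2) * log_var(b) + lam * math.log(n))

def assign_segments(clusters, segments, lam=1.0):
    """Assign each of the M segments to one of the K clusters or open a new one.

    clusters maps a hypothetical integer cluster identifier to its samples.
    Returns the identifier chosen for each segment and the updated clusters.
    """
    clusters = {cid: list(xs) for cid, xs in clusters.items()}  # leave input intact
    labels = []
    for seg in segments:
        scored = [(merge_delta_bic(xs, seg, lam), cid) for cid, xs in clusters.items()]
        best_score, best_id = max(scored, default=(float("-inf"), None))
        if best_score > 0:
            clusters[best_id].extend(seg)          # merge into the best match
        else:
            best_id = max(clusters, default=0) + 1  # new cluster identifier
            clusters[best_id] = list(seg)
        labels.append(best_id)
    return labels, clusters
```

Because each segment is compared only with the clusters known so far, the assignment can run as segments arrive rather than after the whole source has been segmented.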

15. The method of claim 1, further comprising the step of processing said audio
source with a speaker identification engine to assign a speaker name to each of said clusters. 

16. A method for tracking a speaker in an audio source, said method comprising
the steps of:

identifying potential segment boundaries in said audio source; and
clustering segments from said audio source corresponding to the same speaker
substantially concurrently with said identifying step.

17. The method of claim 16, wherein said identifying step identifies segment 
boundaries using a BIC model-selection criterion. 

18. The method of claim 17, wherein a first model assumes there is no boundary 
in a portion of said audio source and a second model assumes there is a boundary in said 
portion of said audio source. 

19. The method of claim 16, wherein said identifying step considers a smaller 
window size, n, of samples in areas where a segment boundary is unlikely to occur. 

20. The method of claim 17, wherein said BIC model selection test is not 
performed where the detection of a boundary is unlikely to occur. 

21. The method of claim 16, wherein said clustering step is performed using a
BIC model-selection criterion, where a first model assumes that two segments or clusters 
should be merged, and a second model assumes that said two segments or clusters should be 
maintained independently. 

22. The method of claim 16, wherein said clustering step is performed using K 
previously identified clusters and M segments to be clustered. 

23. A method for tracking a speaker in an audio source, said method comprising
the steps of:

identifying potential segment boundaries during a pass through said audio
source; and

clustering segments from said audio source corresponding to the same speaker
during said same pass through said audio source.
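
The single-pass arrangement of claim 23 can be sketched as a loop that clusters each segment as soon as its boundary is found. This is a minimal illustration with one-dimensional features, a fixed growth step, and an assumed ΔBIC form; every name and parameter value here is hypothetical.

```python
import math
from statistics import pvariance

def delta_bic(left, right, lam=1.0):
    """Assumed 1-D BIC difference: negative favours two models, positive one model."""
    whole = left + right
    n = len(whole)
    log_var = lambda xs: math.log(pvariance(xs))
    return (-(n / 2) * log_var(whole) + (len(left) / 2) * log_var(left)
            + (len(right) / 2) * log_var(right) + lam * math.log(n))

def cluster_segment(segment, clusters):
    """Merge the segment into the best-matching cluster or open a new one."""
    scored = [(delta_bic(xs, segment), cid) for cid, xs in clusters.items()]
    score, cid = max(scored, default=(float("-inf"), None))
    if score > 0:
        clusters[cid] = clusters[cid] + segment
    else:
        cid = len(clusters) + 1
        clusters[cid] = list(segment)
    return cid

def track_speakers(samples, n_min=10, step=5, margin=3):
    """Return (start, end, cluster_id) triples from one pass over a list of samples.

    Assumes every tested slice has at least two samples and non-zero variance.
    """
    clusters, labels = {}, []
    start, n = 0, n_min
    while start + n <= len(samples):
        window = samples[start:start + n]
        score, i = min((delta_bic(window[:k], window[k:]), k)
                       for k in range(margin, n - margin + 1))
        if score < 0:
            # Boundary found: cluster the finished segment immediately,
            # within the same pass, then restart with the minimum window.
            cid = cluster_segment(window[:i], clusters)
            labels.append((start, start + i, cid))
            start, n = start + i, n_min
        else:
            n += step
    if start < len(samples):
        # Flush the samples after the last detected boundary.
        labels.append((start, len(samples), cluster_segment(samples[start:], clusters)))
    return labels
```

Here the clustering call sits inside the segmentation loop, so no second pass over the audio source is needed.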

24. The method of claim 23, wherein said identifying step identifies segment 
boundaries using a BIC model-selection criterion. 

25. The method of claim 24, wherein a first model assumes there is no boundary 
in a portion of said audio source and a second model assumes there is a boundary in said 
portion of said audio source. 

26. The method of claim 23, wherein said identifying step considers a smaller 
window size, n, of samples in areas where a segment boundary is unlikely to occur. 

27. The method of claim 24, wherein said BIC model selection test is not
performed where the detection of a boundary is unlikely to occur.

28. The method of claim 23, wherein said clustering step is performed using a
BIC model-selection criterion, where a first model assumes that two segments or clusters
should be merged, and a second model assumes that said two segments or clusters should be
maintained independently.

29. The method of claim 23, wherein said clustering step is performed using K
previously identified clusters and M segments to be clustered.

30. A system for tracking a speaker in an audio source, comprising:
a memory that stores computer-readable code; and

a processor operatively coupled to said memory, said processor configured
to implement said computer-readable code, said computer-readable code configured to:
identify potential segment boundaries in said audio source; and
cluster homogeneous segments from said audio source substantially
concurrently with said identification of segment boundaries.

31. An article of manufacture, comprising:

a computer readable medium having computer readable code means embodied
thereon, said computer readable program code means comprising:

a step to identify potential segment boundaries in said audio source; and
a step to cluster homogeneous segments from said audio source substantially
concurrently with said identification of segment boundaries.

32. A system for tracking a speaker in an audio source, comprising:

a memory that stores computer-readable code; and

a processor operatively coupled to said memory, said processor configured
to implement said computer-readable code, said computer-readable code configured to:
identify potential segment boundaries in said audio source; and
cluster segments from said audio source corresponding to the same
speaker substantially concurrently with said identification of segment boundaries.

33. An article of manufacture, comprising:

a computer readable medium having computer readable code means embodied
thereon, said computer readable program code means comprising:

a step to identify potential segment boundaries in said audio source; and
a step to cluster segments from said audio source corresponding to the same
speaker substantially concurrently with said identification of segment boundaries.

34. A system for tracking a speaker in an audio source, comprising:

a memory that stores computer-readable code; and

a processor operatively coupled to said memory, said processor configured
to implement said computer-readable code, said computer-readable code configured to:
identify potential segment boundaries during a pass through said audio
source; and

cluster segments from said audio source corresponding to the same
speaker during said same pass through said audio source.

35. An article of manufacture, comprising:

a computer readable medium having computer readable code means embodied 
thereon, said computer readable program code means comprising: 

a step to identify potential segment boundaries during a pass through said 
audio source; and 

a step to cluster segments from said audio source corresponding to the same 
speaker during said same pass through said audio source. 